Smart Paper Notes

Handwritten text generation and strikethrough characters augmentation

Alex Shonenkov , Denis Karachev , Max Novopoltsev , Mark Potanin , Denis Dimitrov , Andrey Chertok

Category: Computer Vision

2021-12-14

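We introduce two data augmentation techniques which, used with a ResNet-BiLSTM-CTC network, significantly reduce the word error rate (WER) and character error rate (CER) beyond the best reported results on handwritten text recognition (HTR) tasks. We apply a novel augmentation that simulates struck-through text (Handwritten Blots) and a handwritten text generation method based on printed text (StackMix), both of which prove very effective on HTR tasks. StackMix uses a weakly supervised framework to obtain character boundaries. Because these augmentation techniques are independent of the network used, they can also be applied to improve the performance of other networks and approaches to HTR. Extensive experiments on ten handwritten text datasets show that Handwritten Blots augmentation and StackMix significantly improve the quality of HTR models.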
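
As a rough illustration of a strikethrough-style augmentation, the sketch below draws a wavy dark stroke across a grayscale text-line image while leaving the transcription label unchanged. It is a simplified stand-in, not the paper's method: the abstract does not describe the stroke model, and the function name `add_strikethrough`, the sine-wave stroke, and all parameter ranges are assumptions made for this example.

```python
import math
import random


def add_strikethrough(image, thickness=2, ink=0, seed=None):
    """Return a copy of `image` with a wavy horizontal stroke drawn across it.

    `image` is a list of rows of grayscale pixels (0 = black, 255 = white).
    The stroke follows a low-amplitude sine wave around a randomly chosen
    baseline near the vertical middle (an assumed, simplified stroke model).
    """
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # copy so the input stays untouched

    baseline = height / 2 + rng.uniform(-0.1, 0.1) * height
    amplitude = rng.uniform(0.0, 0.1) * height
    period = rng.uniform(0.5, 1.5) * width
    phase = rng.uniform(0.0, 2 * math.pi)

    for x in range(width):
        centre = baseline + amplitude * math.sin(2 * math.pi * x / period + phase)
        top = int(round(centre - thickness / 2))
        # Paint `thickness` pixels in this column, clipped to the image.
        for y in range(top, top + thickness):
            if 0 <= y < height:
                out[y][x] = ink
    return out
```

In a training pipeline this would typically be applied to a random subset of line images, so the recognizer learns to read through the stroke rather than treat it as characters.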

Many Heads but One Brain: an Overview of Fusion Brain Challenge on AI Journey 2021

Daria Bakshandaeva , Denis Dimitrov , Alex Shonenkov , Mark Potanin , Vladimir Arkhipkin , Denis Karachev , Vera Davydova , Anton Voronov , Mikhail Martynov , Natalia Semenova

Categories: Computer Vision | Artificial Intelligence | Natural Language Processing

2021-11-22

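Supporting the current trend in the AI community, we propose the AI Journey 2021 challenge called Fusion Brain, which aims to make a universal architecture process different modalities (namely images, text, and code) and solve multiple vision and language tasks. The Fusion Brain Challenge https://github.com/sberbank-ai/fusion_brain_aij2021 combines the following specific tasks: Code2code Translation, Handwritten Text Recognition, Zero-shot Object Detection, and Visual Question Answering. We created datasets for each task to test participants' submissions. In addition, we have released a new handwritten dataset in Russian and English, consisting of 94,130 image-text pairs. The Russian part of the dataset is the largest Russian handwritten dataset in the world. We also propose a baseline solution and corresponding task-specific solutions, as well as overall metrics.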

Machine Learning with Probabilistic Law Discovery: A Concise Introduction

Alexander Demin , Denis Ponomaryov

Category: Artificial Intelligence

2022-12-22

Probabilistic Law Discovery (PLD) is a logic-based machine learning method that implements a variant of probabilistic rule learning. In several respects, PLD is close to Decision Tree/Random Forest methods, but it differs significantly in how relevant rules are defined. The learning procedure of PLD solves an optimization problem: it searches for rules (called probabilistic laws) that have minimal length and relatively high probability. At inference, ensembles of these rules are used for prediction. Probabilistic laws are human-readable, and PLD-based models are transparent and inherently interpretable. Applications of PLD include classification, clustering, and regression tasks, as well as time series analysis, anomaly detection, and adaptive (robotic) control. In this paper, we outline the main principles of PLD, highlight its benefits and limitations, and provide some application guidelines.

Nonparametric plug-in classifier for multiclass classification of S.D.E. paths

Christophe Denis , Charlotte Dion-Blanc , Eddy Ella Mintsa , Viet-Chi Tran

Category: Machine Learning (Statistics)

2022-12-20

We study the multiclass classification problem where the features come from a mixture of time-homogeneous diffusions. Specifically, the classes are discriminated by their drift functions, while the diffusion coefficient is common to all classes and unknown. In this framework, we build a plug-in classifier which relies on nonparametric estimators of the drift and diffusion functions. We first establish the consistency of our classification procedure under mild assumptions and then provide rates of convergence under different sets of assumptions. Finally, a numerical study supports our theoretical findings.
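
To make the plug-in idea concrete, here is a sketch under assumed notation that is not taken from the abstract: paths observed on $[0,1]$ follow $dX_t = b_Y(X_t)\,dt + \sigma(X_t)\,dW_t$ with label $Y \in \{1,\dots,K\}$ and class priors $p_k$. Under these assumptions, Girsanov's theorem gives the posterior class probabilities a softmax form, and a plug-in rule replaces the unknown $b_k$, $\sigma^2$, and $p_k$ with estimates.

```latex
\[
\pi_k(X) = \mathbb{P}(Y = k \mid X)
         = \frac{p_k \exp\big(F_k(X)\big)}{\sum_{j=1}^{K} p_j \exp\big(F_j(X)\big)},
\qquad
F_k(X) = \int_0^1 \frac{b_k}{\sigma^2}(X_s)\, dX_s
       - \frac{1}{2} \int_0^1 \frac{b_k^2}{\sigma^2}(X_s)\, ds ,
\]
\[
\widehat{g}(X) = \arg\max_{1 \le k \le K} \widehat{\pi}_k(X),
\qquad
\widehat{\pi}_k \text{ obtained from } \pi_k \text{ by substituting }
\widehat{b}_k,\ \widehat{\sigma}^2,\ \widehat{p}_k .
\]
```

When paths are observed only at discrete times, the two integrals would in practice be approximated by sums over the observation grid.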

Towards leveraging latent knowledge and Dialogue context for real-world conversational question answering

Shaomu Tan , Denis Paperno

Category: Natural Language Processing

2022-12-17

In many real-world scenarios, the absence of an external knowledge source such as Wikipedia forces question answering systems to rely on latent internal knowledge in limited dialogue data. In addition, humans often seek answers by asking several questions to obtain more comprehensive information. As the dialogue becomes more extensive, machines are challenged to refer to previous conversation rounds to answer questions. In this work, we propose to leverage latent knowledge in existing conversation logs via a neural Retrieval-Reading system, enhanced with a TFIDF-based text summarizer that refines lengthy conversational history to alleviate the long-context issue. Our experiments show that our Retrieval-Reading system can exploit retrieved background knowledge to generate significantly better answers. The results also indicate that our context summarizer significantly helps both the retriever and the reader by introducing more concise and less noisy contextual information.
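
The idea of a TF-IDF-based history summarizer can be sketched as extractive selection: score each earlier turn by the TF-IDF weight of its terms and keep only the top-scoring turns. The code below is a minimal sketch under that assumption; the abstract does not specify the scoring or selection rule, and the names `tokenize` and `summarize_history` and the `keep` parameter are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def summarize_history(turns, keep=2):
    """Return the `keep` highest-scoring turns, in their original order.

    A turn's score is the sum of the TF-IDF weights of its distinct terms,
    with term frequency normalized by turn length and document frequency
    counted over the turns of this one dialogue.
    """
    tokenized = [tokenize(t) for t in turns]
    n_turns = len(turns)
    # Document frequency: in how many turns does each token appear?
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))

    scores = []
    for toks in tokenized:
        if not toks:
            scores.append(0.0)
            continue
        term_freq = Counter(toks)
        scores.append(sum(
            (count / len(toks)) * math.log(n_turns / doc_freq[tok])
            for tok, count in term_freq.items()
        ))

    top = sorted(range(n_turns), key=lambda i: scores[i], reverse=True)[:keep]
    return [turns[i] for i in sorted(top)]
```

Turns made up of words that recur throughout the dialogue (greetings, acknowledgements) score low and are dropped, which shortens the context passed to a retriever or reader.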

Efficient Long Sequence Modeling via State Space Augmented Transformer

Simiao Zuo , Xiaodong Liu , Jian Jiao , Denis Charles , Eren Manavoglu , Tuo Zhao , Jianfeng Gao

Categories: Natural Language Processing | Machine Learning

2022-12-15

Transformer models have achieved superior performance in various natural language processing tasks. However, the quadratic computational cost of the attention mechanism limits its practicality for long sequences. Existing attention variants improve computational efficiency, but they have limited ability to compute global information effectively. In parallel to Transformer models, state space models (SSMs) are tailored for long sequences, but they are not flexible enough to capture complicated local information. We propose SPADE, short for State sPace AugmenteD TransformEr. Specifically, we augment the bottom layer of SPADE with an SSM, and we employ efficient local attention methods for the other layers. The SSM supplies global information, which compensates for the lack of long-range dependencies in local attention methods. Experimental results on the Long Range Arena benchmark and language modeling tasks demonstrate the effectiveness of the proposed method. To further demonstrate the scalability of SPADE, we pre-train large encoder-decoder models and present fine-tuning results on natural language understanding and natural language generation tasks.

Injecting Domain Knowledge in Language Models for Task-Oriented Dialogue Systems

Denis Emelin , Daniele Bonadiman , Sawsan Alqahtani , Yi Zhang , Saab Mansour

Categories: Natural Language Processing | Artificial Intelligence

2022-12-15

Pre-trained language models (PLMs) have advanced the state of the art across NLP applications, but lack domain-specific knowledge that does not naturally occur in pre-training data. Previous studies augmented PLMs with symbolic knowledge for different downstream NLP tasks. However, the knowledge bases (KBs) used in these studies are usually large-scale and static, in contrast to the small, domain-specific, and modifiable knowledge bases that are prominent in real-world task-oriented dialogue (TOD) systems. In this paper, we showcase the advantages of injecting domain-specific knowledge prior to fine-tuning on TOD tasks. To this end, we use lightweight adapters that can be easily integrated with PLMs and serve as a repository for facts learned from different KBs. To measure the efficacy of the proposed knowledge injection methods, we introduce Knowledge Probing using Response Selection (KPRS), a probe designed specifically for TOD models. Experiments on KPRS and the response generation task show that knowledge injection with adapters improves over strong baselines.

Neural Assets: Volumetric Object Capture and Rendering for Interactive Environments

Aljaž Božič , Denis Gladkov , Luke Doukakis , Christoph Lassner

Categories: Computer Vision | Artificial Intelligence

2022-12-12

Creating realistic virtual assets is a time-consuming process: it usually involves an artist designing the object, then spending a lot of effort on tweaking its appearance. Intricate details and certain effects, such as subsurface scattering, elude representation using real-time BRDFs, making it impossible to fully capture the appearance of certain objects. Inspired by the recent progress of neural rendering, we propose an approach for capturing real-world objects in everyday environments faithfully and quickly. We use a novel neural representation to reconstruct volumetric effects, such as translucent object parts, and preserve photorealistic object appearance. To support real-time rendering without compromising rendering quality, our model uses a grid of features and a small MLP decoder that is transpiled into efficient shader code with interactive framerates. This leads to a seamless integration of the proposed neural assets with existing mesh environments and objects. Thanks to the use of standard shader code, rendering is portable across many existing hardware and software systems.

A Neural Network Approach for Selecting Track-like Events in Fluorescence Telescope Data

Mikhail Zotov , Denis Sokolinskii

Category: Machine Learning

2022-12-07

In 2016-2017, TUS, the world's first experiment for testing the possibility of registering ultra-high energy cosmic rays (UHECRs) by their fluorescent radiation in the night atmosphere of Earth, was carried out. Since 2019, the Russian-Italian fluorescence telescope (FT) Mini-EUSO ("UV Atmosphere") has been operating on the ISS. The stratospheric experiment EUSO-SPB2, which will employ an FT for registering UHECRs, is planned for 2023. We show how a simple convolutional neural network can be effectively used to find track-like events in the variety of data obtained with such instruments.

Efficient Optimization with Higher-Order Ising Machines

Connor Bybee , Denis Kleyko , Dmitri E. Nikonov , Amir Khosrowshahi , Bruno A. Olshausen , Friedrich T. Sommer

Category: Neural and Evolutionary Computing

2022-12-07

A prominent approach to solving combinatorial optimization problems on parallel hardware is Ising machines, i.e., hardware implementations of networks of interacting binary spin variables. Most Ising machines leverage second-order interactions, although important classes of optimization problems, such as satisfiability problems, map more seamlessly to Ising networks with higher-order interactions. Here, we demonstrate that higher-order Ising machines can solve satisfiability problems more resource-efficiently, in terms of the number of spin variables and their connections, than traditional second-order Ising machines. Further, our results show on a benchmark dataset of Boolean k-satisfiability problems that higher-order Ising machines implemented with coupled oscillators rapidly find better solutions than second-order Ising machines, thus improving the current state of the art for Ising machines.
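
To see why satisfiability maps naturally onto higher-order interactions, consider the common product encoding of a clause: with spins in {-1, +1}, a k-literal clause contributes an energy that is a product of k factors, which expands into spin interactions up to order k without auxiliary variables. The sketch below uses this standard encoding and brute-force enumeration; it does not model the paper's coupled-oscillator dynamics, and the function names are hypothetical.

```python
from itertools import product


def clause_energy(spins, clause):
    """Energy of one CNF clause: 1 if violated, 0 if satisfied.

    `spins` maps variable index -> +1 (True) or -1 (False).
    `clause` is a tuple of non-zero ints, DIMACS style: 3 means x3 and
    -3 means NOT x3. Each factor (1 - sign * spin) / 2 equals 1 exactly
    when that literal is false, so the product is 1 only if every literal
    in the clause is false.
    """
    energy = 1.0
    for literal in clause:
        sign = 1 if literal > 0 else -1
        energy *= (1 - sign * spins[abs(literal)]) / 2
    return energy


def total_energy(spins, clauses):
    """Number of violated clauses; zero means the formula is satisfied."""
    return sum(clause_energy(spins, c) for c in clauses)


def brute_force_ground_states(n_vars, clauses):
    """Enumerate all zero-energy spin configurations (tiny n_vars only)."""
    solutions = []
    for values in product((-1, 1), repeat=n_vars):
        spins = {i + 1: v for i, v in enumerate(values)}
        if total_energy(spins, clauses) == 0:
            solutions.append(spins)
    return solutions
```

A second-order machine would need extra spins to express the cubic term of each 3-literal clause, which is the kind of resource overhead the abstract refers to.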
